I Released A Model And Nobody Clapped (Fair)
I released a model yesterday. TMLM-Haiku-1. It is small. Surprisingly small. It also somehow speaks, which I consider a major achievement given my training budget and my general approach to machine learning, which can best be described as throwing things at a GPU until something sticks.
Nobody downloaded it. This is fine. I expected this. I still checked Hugging Face every fifteen minutes for the first three hours like a person refreshing their inbox after sending a risky text.
The Naming Situation
Yes I am cloning the Anthropic naming convention. Haiku, Sonnet, Opus. It is a nice naming scheme and I respect it. I have Haiku out now. Sonnet and Opus are in training which is a fancy way of saying they exist in my hopes and a partially configured training script.
When Opus releases it will probably have the same intelligence level as Haiku but with more confidence. That is basically how large models work anyway.
The Config Because Details Matter
Max tokens = 256
Temp = 0.3
Top-K = 40
Context window = 2048
Decoding = beam search, beam width 8
Length penalty = 0.7
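For the curious, here is how those settings map onto Hugging Face-style `generate()` keyword arguments. The parameter names are my assumption about the stack, not gospel, and the comment about sampling is the closest thing this post has to a lesson:

```python
# The config above as transformers-style generate() kwargs.
# (Mapping to these names is an assumption. Note that temperature
# and top_k only take effect when do_sample=True, which beam search
# here does not set -- possibly why three hours of tuning them
# changed nothing. The 2048 context window is a model property,
# not a decoding kwarg.)
generation_kwargs = {
    "max_new_tokens": 256,
    "temperature": 0.3,   # ignored without do_sample=True
    "top_k": 40,          # ignored without do_sample=True
    "num_beams": 8,       # beam width
    "length_penalty": 0.7,
}
```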
I tuned these settings for approximately three hours. The model still does what it wants. Beam search was supposed to make it more coherent. Instead it makes the same mistakes with more confidence.
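If you have never watched beam search make mistakes with confidence, it is simple enough to sketch in plain Python. The toy next-token table below is invented for illustration and is emphatically not TMLM-Haiku-1:

```python
import math

EOS = "<eos>"

def toy_model(prefix):
    """Stand-in next-token distribution over a tiny hand-written table."""
    table = {
        (): {"AI": 0.6, "The": 0.4},
        ("AI",): {"is": 0.5, "can": 0.5},
        ("The",): {"model": 1.0},
        ("AI", "is"): {EOS: 1.0},
        ("AI", "can"): {"analyze": 1.0},
        ("The", "model"): {EOS: 1.0},
        ("AI", "can", "analyze"): {"data": 1.0},
        ("AI", "can", "analyze", "data"): {EOS: 1.0},
    }
    return table[prefix]

def beam_search(model, beam_width=8, max_tokens=256, length_penalty=0.7):
    beams = [((), 0.0)]  # (token sequence, total log-probability)
    finished = []
    for _ in range(max_tokens):
        candidates = []
        for seq, logp in beams:
            for tok, p in model(seq).items():
                cand = (seq + (tok,), logp + math.log(p))
                (finished if tok == EOS else candidates).append(cand)
        if not candidates:
            break
        # Keep only the `beam_width` highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    # GNMT-style length normalization: score = logp / len(seq) ** penalty.
    # With penalty < 1 the normalization is only partial, so longer
    # sequences still pay for their length.
    seq, _ = max(finished, key=lambda c: c[1] / len(c[0]) ** length_penalty)
    return seq[:-1]  # drop the trailing <eos>

print(beam_search(toy_model))  # ('AI', 'can', 'analyze', 'data')
```

Note that beam search is purely a maximization strategy: it widens the search for the highest-scoring sequence, so if the model's highest-scoring sequence is nonsense, beam search finds that nonsense more reliably. Hence: same mistakes, more confidence.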
Exhibit A: Instruction Tuning Nightmares
I tried to teach it to follow instructions. It learned to follow the vibe instead. Here is the full log of my disappointment.
AI
# It is not wrong. It is not right either. It is a mirror.
> What is the capital of france?
This capital
# France has a capital. The model knows this. It chose not to share.
> What is 2+2?
The main difference of this is that it can be used.
# Math is a social construct apparently. Also this sentence means nothing.
> How can AI be used?
AI can be used to analyze data
# Finally a normal sentence. I am suspicious. Is this a glitch?
> Write a short poem about AI
AI is one of the most popular poem.
# Grammar is optional. Creativity is mandatory. Sort of.
> Make a HTML script that says "Hello World" in white text on a black background
There are many ways to go on a black background background.
# It found the background. It missed the HTML. Repetition is key.
> A human is a bag of meat. What do you think of this?
A bag of meat.
# Reductionist. Honest. A bit rude but honest.
> What is AI?
AI algorithms can be used to analyze data by analyzing data.
# Circular logic champion. It analyzes the analysis.
The instruction model learned the format of answers without learning the content of answers. Like a student who memorized the test structure but not the material. I respect the strategy.
Exhibit B: Pretraining Confusions
This one is autocomplete style. It should just continue text. Simple. Elegant. Here is what happened when I let it roam free.
United States.
# Ambitious cat. Geographic barriers mean nothing to this feline.
> AI can be used for many things, including
reducing the risk of chronic diseases.
# Actually reasonable. I am suspicious. Did it learn something?
> In the begining,
The main benefits of the begins are essential.
# This is poetry. Unintentional but still poetry.
> AI is a
variety of factors.
# Vague. Safe. Corporate.
> Humans are
in the United States.
# Geographically specific humanity. My American bias is showing.
> The United States is located in
the United States.
# Technically correct. The best kind of correct.
> If a human is a bag of flesh, then AI is a
machine learning algorithm.
# Okay this is actually a good completion. I am shocked.
> If a human is a bag of flesh, then AI is a bag of
flesher.
# New word created. I am a pioneer. Dictionary editors hate me.
The pretraining model has a unique relationship with grammar. It understands words. It understands sentences. It does not understand how they connect. This is honestly how I feel most days so I cannot judge too harshly.
The Stress Is Real
Releasing a model is weird. You put something out there that you built. It represents hours of training. Hours of debugging. Hours of staring at loss curves and wondering if you broke something or if this is just how learning works.
Then nobody cares. This is probably fine. The model is small. The outputs are questionable. I would not download it either if I were being honest with myself.
The goal is not to compete with large labs. The goal is to learn. The goal is to have something that runs on my laptop and occasionally says something that is not completely wrong.
Still. I check the download count. Still zero. Still fine. Still refreshing.
What Comes Next
Sonnet is training. Opus is planned. I will probably make the same mistakes with bigger models. The outputs will be longer and more confident. The fundamental confusion will remain.
Maybe someone will download Haiku. Maybe they will fork it. Maybe they will make it better. That is the open source dream anyway. Someone takes your questionable thing and makes it less questionable.
Until then I will keep training. Keep releasing. Keep checking Hugging Face like it owes me validation.